Overview

Dataset statistics

Number of variables: 11
Number of observations: 746
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 64.8 KiB
Average record size in memory: 89.0 B

Variable types

Numeric: 6
Categorical: 4
Boolean: 1

Alerts

Obs Code has constant value "654" [Constant]
Magnitude Obs Code has constant value "654" [Constant]
Obs Code Agreement has constant value "True" [Constant]
Date has a high cardinality: 112 distinct values [High cardinality]
Bulletin has a high cardinality: 235 distinct values [High cardinality]
Residual RA is highly overall correlated with Residual Dec and 1 other fields [High correlation]
Residual Dec is highly overall correlated with Residual RA and 1 other fields [High correlation]
Diagonal Residual is highly overall correlated with Residual RA and 1 other fields [High correlation]
Bulletin is uniformly distributed [Uniform]
Unnamed: 0 has unique values [Unique]
Residual RA has 243 (32.6%) zeros [Zeros]
Residual Dec has 270 (36.2%) zeros [Zeros]
Diagonal Residual has 96 (12.9%) zeros [Zeros]
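
The percentages in the Zeros alerts can be checked against the observation count. The following is a minimal sketch, not the report's own code; the helper name `percent` is hypothetical, and the counts are copied from the alerts and the dataset statistics above.

```python
N_OBSERVATIONS = 746  # "Number of observations" from the dataset statistics

# Zero counts as reported in the Zeros alerts.
zero_counts = {
    "Residual RA": 243,
    "Residual Dec": 270,
    "Diagonal Residual": 96,
}


def percent(count: int, total: int) -> float:
    """Share of `count` in `total` as a percentage, rounded to one decimal."""
    return round(100 * count / total, 1)


zero_share = {name: percent(c, N_OBSERVATIONS) for name, c in zero_counts.items()}
for name, share in zero_share.items():
    print(f"{name}: {share}% zeros")
```

Under these inputs the computed shares agree with the 32.6%, 36.2% and 12.9% shown in the alerts.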

Reproduction

Analysis started: 2022-12-16 21:34:48.243707
Analysis finished: 2022-12-16 21:34:53.959553
Duration: 5.72 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json

Variables

Unnamed: 0
Real number (ℝ)

Distinct: 746
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3908.3874
Minimum: 20
Maximum: 7389
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.7 KiB

Quantile statistics

Minimum: 20
5-th percentile: 412.25
Q1: 2138.5
median: 4189.5
Q3: 5632.25
95-th percentile: 7036.75
Maximum: 7389
Range: 7369
Interquartile range (IQR): 3493.75

Descriptive statistics

Standard deviation: 2117.0307
Coefficient of variation (CV): 0.54166349
Kurtosis: -1.1464872
Mean: 3908.3874
Median Absolute Deviation (MAD): 1748
Skewness: -0.18094796
Sum: 2915657
Variance: 4481819.2
Monotonicity: Strictly increasing
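
Several of the figures above follow from one another. The sketch below recomputes them from the rounded values printed in this report, using the textbook definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, mean = sum / count). It assumes those definitions are the ones the report uses; small differences come from rounding in the printed inputs.

```python
# Values copied from the "Unnamed: 0" statistics in this report.
n_observations = 746
minimum, maximum = 20, 7389
q1, q3 = 2138.5, 5632.25
total = 2915657
std = 2117.0307

mean = total / n_observations   # arithmetic mean
value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50%
cv = std / mean                  # relative dispersion (unitless)
variance = std ** 2              # squared standard deviation

print(f"mean={mean:.4f} range={value_range} iqr={iqr} cv={cv:.6f} variance={variance:.1f}")
```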
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
20 | 1 | 0.1%
5272 | 1 | 0.1%
5211 | 1 | 0.1%
5212 | 1 | 0.1%
5213 | 1 | 0.1%
5230 | 1 | 0.1%
5231 | 1 | 0.1%
5232 | 1 | 0.1%
5253 | 1 | 0.1%
5254 | 1 | 0.1%
Other values (736) | 736 | 98.7%

Minimum values

Value | Count | Frequency (%)
20 | 1 | 0.1%
25 | 1 | 0.1%
26 | 1 | 0.1%
38 | 1 | 0.1%
39 | 1 | 0.1%
40 | 1 | 0.1%
41 | 1 | 0.1%
62 | 1 | 0.1%
63 | 1 | 0.1%
64 | 1 | 0.1%

Maximum values

Value | Count | Frequency (%)
7389 | 1 | 0.1%
7388 | 1 | 0.1%
7379 | 1 | 0.1%
7351 | 1 | 0.1%
7350 | 1 | 0.1%
7324 | 1 | 0.1%
7323 | 1 | 0.1%
7322 | 1 | 0.1%
7303 | 1 | 0.1%
7302 | 1 | 0.1%

Date
Categorical

Distinct: 112
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Memory size: 11.7 KiB

Value | Count
7/27/2022 | 19
8/30/2022 | 18
8/21/2022 | 18
6/28/2022 | 18
6/21/2022 | 18
Other values (107) | 655

Length

Max length: 10
Median length: 9
Mean length: 8.7453083
Min length: 8

Characters and Unicode

Total characters: 6524
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 0.4%

Sample

1st row: 1/7/2022
2nd row: 1/7/2022
3rd row: 1/7/2022
4th row: 1/13/2022
5th row: 1/13/2022
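
Date is profiled here as a categorical string. If a chronological ordering is wanted, the values can be parsed first. The sketch below assumes a month/day/year layout, which the samples suggest (for example 1/13/2022 and 7/27/2022 cannot be day/month); the function name `parse_report_date` is hypothetical.

```python
from datetime import date, datetime

# Assumed layout: month/day/year without zero padding, as in "1/13/2022".
DATE_FORMAT = "%m/%d/%Y"


def parse_report_date(text: str) -> date:
    """Parse a date string such as '1/7/2022' into a datetime.date."""
    return datetime.strptime(text, DATE_FORMAT).date()


samples = ["10/20/2022", "1/13/2022", "1/7/2022"]
parsed = sorted(parse_report_date(s) for s in samples)
print([d.isoformat() for d in parsed])
```

Sorting the parsed dates orders them chronologically, whereas sorting the raw strings would place "10/20/2022" before "1/7/2022".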

Common Values

Value | Count | Frequency (%)
7/27/2022 | 19 | 2.5%
8/30/2022 | 18 | 2.4%
8/21/2022 | 18 | 2.4%
6/28/2022 | 18 | 2.4%
6/21/2022 | 18 | 2.4%
8/22/2022 | 17 | 2.3%
6/2/2022 | 16 | 2.1%
8/23/2022 | 16 | 2.1%
10/20/2022 | 15 | 2.0%
2/4/2022 | 14 | 1.9%
Other values (102) | 577 | 77.3%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
7/27/2022 | 19 | 2.5%
8/30/2022 | 18 | 2.4%
8/21/2022 | 18 | 2.4%
6/28/2022 | 18 | 2.4%
6/21/2022 | 18 | 2.4%
8/22/2022 | 17 | 2.3%
6/2/2022 | 16 | 2.1%
8/23/2022 | 16 | 2.1%
10/20/2022 | 15 | 2.0%
2/4/2022 | 14 | 1.9%
Other values (102) | 577 | 77.3%

Most occurring characters

Value | Count | Frequency (%)
2 | 2732 | 41.9%
/ | 1492 | 22.9%
0 | 909 | 13.9%
1 | 295 | 4.5%
6 | 201 | 3.1%
8 | 195 | 3.0%
7 | 178 | 2.7%
3 | 146 | 2.2%
5 | 132 | 2.0%
4 | 127 | 1.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 5032 | 77.1%
Other Punctuation | 1492 | 22.9%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 2732 | 54.3%
0 | 909 | 18.1%
1 | 295 | 5.9%
6 | 201 | 4.0%
8 | 195 | 3.9%
7 | 178 | 3.5%
3 | 146 | 2.9%
5 | 132 | 2.6%
4 | 127 | 2.5%
9 | 117 | 2.3%

Other Punctuation
Value | Count | Frequency (%)
/ | 1492 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6524 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 2732 | 41.9%
/ | 1492 | 22.9%
0 | 909 | 13.9%
1 | 295 | 4.5%
6 | 201 | 3.1%
8 | 195 | 3.0%
7 | 178 | 2.7%
3 | 146 | 2.2%
5 | 132 | 2.0%
4 | 127 | 1.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6524 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 2732 | 41.9%
/ | 1492 | 22.9%
0 | 909 | 13.9%
1 | 295 | 4.5%
6 | 201 | 3.1%
8 | 195 | 3.0%
7 | 178 | 2.7%
3 | 146 | 2.2%
5 | 132 | 2.0%
4 | 127 | 1.9%

Bulletin
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 235
Distinct (%): 31.5%
Missing: 0
Missing (%): 0.0%
Memory size: 11.7 KiB

Value | Count
2022-L78 | 8
2022-Q63 | 7
2022-L71 | 7
2022-L09 | 7
2022-O73 | 7
Other values (230) | 710

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 6714
Distinct characters: 32
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 1.2%

Sample

1st row: 2022-A62
2nd row: 2022-A62
3rd row: 2022-A62
4th row: 2022-A171
5th row: 2022-A171
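
The length statistics report every Bulletin value as 9 characters long and count 516 space characters, while a sample such as 2022-A62 displays as 8 characters. One plausible reading, which is an assumption and not something the report states, is that shorter identifiers are padded with whitespace. Under that assumption, a minimal normalisation sketch (the function name `normalize_bulletin` and the padded sample are hypothetical) could look like this:

```python
def normalize_bulletin(raw: str) -> str:
    """Remove surrounding whitespace from a bulletin identifier."""
    return raw.strip()


# Hypothetical raw values: the trailing space on the first one is assumed.
raw_values = ["2022-A62 ", "2022-A171"]
cleaned = [normalize_bulletin(v) for v in raw_values]

for raw, clean in zip(raw_values, cleaned):
    print(f"{raw!r} (len {len(raw)}) -> {clean!r} (len {len(clean)})")
```

Stripping before grouping or joining on Bulletin avoids treating padded and unpadded spellings of the same identifier as different values.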

Common Values

Value | Count | Frequency (%)
2022-L78 | 8 | 1.1%
2022-Q63 | 7 | 0.9%
2022-L71 | 7 | 0.9%
2022-L09 | 7 | 0.9%
2022-O73 | 7 | 0.9%
2022-Q47 | 7 | 0.9%
2022-M25 | 7 | 0.9%
2022-S236 | 6 | 0.8%
2022-M89 | 6 | 0.8%
2022-Q88 | 6 | 0.8%
Other values (225) | 678 | 90.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
2022-l78 | 8 | 1.1%
2022-l71 | 7 | 0.9%
2022-l09 | 7 | 0.9%
2022-o73 | 7 | 0.9%
2022-q47 | 7 | 0.9%
2022-m25 | 7 | 0.9%
2022-q63 | 7 | 0.9%
2022-q136 | 6 | 0.8%
2022-m60 | 6 | 0.8%
2022-q195 | 6 | 0.8%
Other values (225) | 678 | 90.9%

Most occurring characters

Value | Count | Frequency (%)
2 | 2431 | 36.2%
0 | 901 | 13.4%
- | 746 | 11.1%
(space) | 516 | 7.7%
1 | 334 | 5.0%
3 | 190 | 2.8%
5 | 179 | 2.7%
7 | 155 | 2.3%
6 | 146 | 2.2%
8 | 131 | 2.0%
Other values (22) | 985 | 14.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4706 | 70.1%
Dash Punctuation | 746 | 11.1%
Uppercase Letter | 746 | 11.1%
Space Separator | 516 | 7.7%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
Q | 121 | 16.2%
M | 74 | 9.9%
L | 73 | 9.8%
C | 64 | 8.6%
U | 48 | 6.4%
S | 47 | 6.3%
O | 37 | 5.0%
G | 36 | 4.8%
N | 33 | 4.4%
K | 29 | 3.9%
Other values (10) | 184 | 24.7%

Decimal Number
Value | Count | Frequency (%)
2 | 2431 | 51.7%
0 | 901 | 19.1%
1 | 334 | 7.1%
3 | 190 | 4.0%
5 | 179 | 3.8%
7 | 155 | 3.3%
6 | 146 | 3.1%
8 | 131 | 2.8%
4 | 120 | 2.5%
9 | 119 | 2.5%

Dash Punctuation
Value | Count | Frequency (%)
- | 746 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 516 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 5968 | 88.9%
Latin | 746 | 11.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
Q | 121 | 16.2%
M | 74 | 9.9%
L | 73 | 9.8%
C | 64 | 8.6%
U | 48 | 6.4%
S | 47 | 6.3%
O | 37 | 5.0%
G | 36 | 4.8%
N | 33 | 4.4%
K | 29 | 3.9%
Other values (10) | 184 | 24.7%

Common
Value | Count | Frequency (%)
2 | 2431 | 40.7%
0 | 901 | 15.1%
- | 746 | 12.5%
(space) | 516 | 8.6%
1 | 334 | 5.6%
3 | 190 | 3.2%
5 | 179 | 3.0%
7 | 155 | 2.6%
6 | 146 | 2.4%
8 | 131 | 2.2%
Other values (2) | 239 | 4.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6714 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 2431 | 36.2%
0 | 901 | 13.4%
- | 746 | 11.1%
(space) | 516 | 7.7%
1 | 334 | 5.0%
3 | 190 | 2.8%
5 | 179 | 2.7%
7 | 155 | 2.3%
6 | 146 | 2.2%
8 | 131 | 2.0%
Other values (22) | 985 | 14.7%

Obs Code
Categorical

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.7 KiB

Value | Count
654 | 746

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2238
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 654
2nd row: 654
3rd row: 654
4th row: 654
5th row: 654

Common Values

Value | Count | Frequency (%)
654 | 746 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
654 | 746 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
6 | 746 | 33.3%
5 | 746 | 33.3%
4 | 746 | 33.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2238 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
6 | 746 | 33.3%
5 | 746 | 33.3%
4 | 746 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2238 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
6 | 746 | 33.3%
5 | 746 | 33.3%
4 | 746 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2238 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
6 | 746 | 33.3%
5 | 746 | 33.3%
4 | 746 | 33.3%

Residual RA
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 13
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.11487936
Minimum: 0
Maximum: 1.8
Zeros: 243
Zeros (%): 32.6%
Negative: 0
Negative (%): 0.0%
Memory size: 11.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.1
Q3: 0.1
95-th percentile: 0.3
Maximum: 1.8
Range: 1.8
Interquartile range (IQR): 0.1

Descriptive statistics

Standard deviation: 0.15554554
Coefficient of variation (CV): 1.3539903
Kurtosis: 34.248737
Mean: 0.11487936
Median Absolute Deviation (MAD): 0.1
Skewness: 4.5426291
Sum: 85.7
Variance: 0.024194415
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values

Value | Count | Frequency (%)
0.1 | 338 | 45.3%
0 | 243 | 32.6%
0.2 | 89 | 11.9%
0.3 | 40 | 5.4%
0.4 | 13 | 1.7%
0.5 | 11 | 1.5%
0.7 | 4 | 0.5%
1.2 | 2 | 0.3%
0.6 | 2 | 0.3%
1.8 | 1 | 0.1%
Other values (3) | 3 | 0.4%

Minimum values

Value | Count | Frequency (%)
0 | 243 | 32.6%
0.1 | 338 | 45.3%
0.2 | 89 | 11.9%
0.3 | 40 | 5.4%
0.4 | 13 | 1.7%
0.5 | 11 | 1.5%
0.6 | 2 | 0.3%
0.7 | 4 | 0.5%
0.8 | 1 | 0.1%
0.9 | 1 | 0.1%

Maximum values

Value | Count | Frequency (%)
1.8 | 1 | 0.1%
1.5 | 1 | 0.1%
1.2 | 2 | 0.3%
0.9 | 1 | 0.1%
0.8 | 1 | 0.1%
0.7 | 4 | 0.5%
0.6 | 2 | 0.3%
0.5 | 11 | 1.5%
0.4 | 13 | 1.7%
0.3 | 40 | 5.4%

Residual Dec
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 9
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.097184987
Minimum: 0
Maximum: 1
Zeros: 270
Zeros (%): 36.2%
Negative: 0
Negative (%): 0.0%
Memory size: 11.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.1
Q3: 0.1
95-th percentile: 0.3
Maximum: 1
Range: 1
Interquartile range (IQR): 0.1

Descriptive statistics

Standard deviation: 0.10684046
Coefficient of variation (CV): 1.0993515
Kurtosis: 11.248024
Mean: 0.097184987
Median Absolute Deviation (MAD): 0.1
Skewness: 2.3206378
Sum: 72.5
Variance: 0.011414884
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Common values

Value | Count | Frequency (%)
0.1 | 311 | 41.7%
0 | 270 | 36.2%
0.2 | 117 | 15.7%
0.3 | 33 | 4.4%
0.4 | 6 | 0.8%
0.5 | 4 | 0.5%
0.7 | 3 | 0.4%
1 | 1 | 0.1%
0.6 | 1 | 0.1%

Minimum values

Value | Count | Frequency (%)
0 | 270 | 36.2%
0.1 | 311 | 41.7%
0.2 | 117 | 15.7%
0.3 | 33 | 4.4%
0.4 | 6 | 0.8%
0.5 | 4 | 0.5%
0.6 | 1 | 0.1%
0.7 | 3 | 0.4%
1 | 1 | 0.1%

Maximum values

Value | Count | Frequency (%)
1 | 1 | 0.1%
0.7 | 3 | 0.4%
0.6 | 1 | 0.1%
0.5 | 4 | 0.5%
0.4 | 6 | 0.8%
0.3 | 33 | 4.4%
0.2 | 117 | 15.7%
0.1 | 311 | 41.7%
0 | 270 | 36.2%

Num
Real number (ℝ)

Distinct: 71
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.912869
Minimum: 4
Maximum: 243
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.7 KiB

Quantile statistics

Minimum: 4
5-th percentile: 12
Q1: 17
median: 21
Q3: 30
95-th percentile: 53
Maximum: 243
Range: 239
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 25.017077
Coefficient of variation (CV): 0.92955817
Kurtosis: 50.752003
Mean: 26.912869
Median Absolute Deviation (MAD): 6
Skewness: 6.4591681
Sum: 20077
Variance: 625.85414
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
17 | 47 | 6.3%
19 | 45 | 6.0%
15 | 42 | 5.6%
20 | 41 | 5.5%
22 | 38 | 5.1%
21 | 38 | 5.1%
18 | 36 | 4.8%
23 | 32 | 4.3%
14 | 31 | 4.2%
16 | 30 | 4.0%
Other values (61) | 366 | 49.1%

Minimum values

Value | Count | Frequency (%)
4 | 1 | 0.1%
8 | 2 | 0.3%
9 | 8 | 1.1%
10 | 9 | 1.2%
11 | 17 | 2.3%
12 | 14 | 1.9%
13 | 26 | 3.5%
14 | 31 | 4.2%
15 | 42 | 5.6%
16 | 30 | 4.0%

Maximum values

Value | Count | Frequency (%)
243 | 1 | 0.1%
242 | 1 | 0.1%
241 | 1 | 0.1%
240 | 1 | 0.1%
239 | 1 | 0.1%
238 | 1 | 0.1%
224 | 1 | 0.1%
223 | 1 | 0.1%
85 | 1 | 0.1%
84 | 1 | 0.1%

Magnitude
Real number (ℝ)

Distinct: 47
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.111394
Minimum: 15.9
Maximum: 22.1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.7 KiB

Quantile statistics

Minimum: 15.9
5-th percentile: 18.4
Q1: 19.7
median: 20.3
Q3: 20.7
95-th percentile: 21.2
Maximum: 22.1
Range: 6.2
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.86701923
Coefficient of variation (CV): 0.043110847
Kurtosis: 2.251246
Mean: 20.111394
Median Absolute Deviation (MAD): 0.5
Skewness: -1.1366495
Sum: 15003.1
Variance: 0.75172235
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Common values

Value | Count | Frequency (%)
20.5 | 52 | 7.0%
20.6 | 48 | 6.4%
20.3 | 48 | 6.4%
20.9 | 47 | 6.3%
20.4 | 44 | 5.9%
20.2 | 41 | 5.5%
20.7 | 39 | 5.2%
20.1 | 34 | 4.6%
19.6 | 28 | 3.8%
19.7 | 26 | 3.5%
Other values (37) | 339 | 45.4%

Minimum values

Value | Count | Frequency (%)
15.9 | 2 | 0.3%
16.1 | 1 | 0.1%
16.4 | 1 | 0.1%
17.3 | 3 | 0.4%
17.4 | 1 | 0.1%
17.9 | 2 | 0.3%
18 | 2 | 0.3%
18.1 | 4 | 0.5%
18.2 | 10 | 1.3%
18.3 | 11 | 1.5%

Maximum values

Value | Count | Frequency (%)
22.1 | 1 | 0.1%
21.9 | 2 | 0.3%
21.8 | 2 | 0.3%
21.7 | 1 | 0.1%
21.6 | 1 | 0.1%
21.5 | 7 | 0.9%
21.4 | 14 | 1.9%
21.3 | 6 | 0.8%
21.2 | 17 | 2.3%
21.1 | 17 | 2.3%

Magnitude Obs Code
Categorical

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.7 KiB

Value | Count
654 | 746

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2238
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 654
2nd row: 654
3rd row: 654
4th row: 654
5th row: 654

Common Values

Value | Count | Frequency (%)
654 | 746 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
654 | 746 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
6 | 746 | 33.3%
5 | 746 | 33.3%
4 | 746 | 33.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2238 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
6 | 746 | 33.3%
5 | 746 | 33.3%
4 | 746 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2238 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
6 | 746 | 33.3%
5 | 746 | 33.3%
4 | 746 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2238 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
6 | 746 | 33.3%
5 | 746 | 33.3%
4 | 746 | 33.3%

Obs Code Agreement
Boolean

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB

Value | Count
True | 746

Value | Count | Frequency (%)
True | 746 | 100.0%

Diagonal Residual
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 29
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.17435326
Minimum: 0
Maximum: 2.059126
Zeros: 96
Zeros (%): 12.9%
Negative: 0
Negative (%): 0.0%
Memory size: 11.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.1
median: 0.14142136
Q3: 0.2236068
95-th percentile: 0.41231056
Maximum: 2.059126
Range: 2.059126
Interquartile range (IQR): 0.1236068

Descriptive statistics

Standard deviation: 0.16685927
Coefficient of variation (CV): 0.95701839
Kurtosis: 32.670775
Mean: 0.17435326
Median Absolute Deviation (MAD): 0.058578644
Skewness: 4.2442023
Sum: 130.06753
Variance: 0.027842017
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=29)
Common values

Value | Count | Frequency (%)
0.1 | 230 | 30.8%
0.1414213562 | 135 | 18.1%
0.2236067977 | 98 | 13.1%
0 | 96 | 12.9%
0.2 | 65 | 8.7%
0.316227766 | 36 | 4.8%
0.3 | 21 | 2.8%
0.2828427125 | 10 | 1.3%
0.3605551275 | 8 | 1.1%
0.4123105626 | 7 | 0.9%
Other values (19) | 40 | 5.4%

Minimum values

Value | Count | Frequency (%)
0 | 96 | 12.9%
0.1 | 230 | 30.8%
0.1414213562 | 135 | 18.1%
0.2 | 65 | 8.7%
0.2236067977 | 98 | 13.1%
0.2828427125 | 10 | 1.3%
0.3 | 21 | 2.8%
0.316227766 | 36 | 4.8%
0.3605551275 | 8 | 1.1%
0.4 | 5 | 0.7%

Maximum values

Value | Count | Frequency (%)
2.059126028 | 1 | 0.1%
1.513274595 | 1 | 0.1%
1.236931688 | 1 | 0.1%
1.204159458 | 1 | 0.1%
0.9848857802 | 1 | 0.1%
0.8602325267 | 2 | 0.3%
0.8246211251 | 1 | 0.1%
0.7810249676 | 1 | 0.1%
0.7280109889 | 3 | 0.4%
0.7211102551 | 1 | 0.1%

Interactions

[Interaction plots omitted: 36 pairwise plots of the numeric variables]

Correlations


Auto

The auto setting chooses an interpretable pairwise metric for each pair of columns according to the column types:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramer's V, [0,1]
  • Numerical-Categorical : Cramer's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used to discretize the numerical column of a Numerical-Categorical pair can be changed using config.correlations["auto"].n_bins. The number of bins determines the granularity at which the association is measured.

This configuration uses the recommended metric for each pair of columns.
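
For the categorical pairs, Cramer's V can be sketched from a contingency table. The code below uses the plain textbook formula, V = sqrt(χ² / (n · (min(rows, cols) - 1))), without any bias correction; whether the report applies a correction is not stated here, so treat this as an illustration and not as the library's implementation. Returning 0.0 for a constant column is also an assumption made for the sketch.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramer's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed cell counts
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for a, row_count in row_totals.items():
        for b, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        # A constant column carries no association (assumed convention).
        return 0.0
    return sqrt(chi2 / (n * k))


print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))  # fully associated
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))  # independent
```

The constant-column case matters for this dataset, because Obs Code and Magnitude Obs Code each take a single value.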

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 no monotonic correlation, and +1 total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
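
The calculation just described can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. Giving tied values the average of their ranks is a common convention and an assumption here; the function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def covariance_ratio(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: the covariance ratio applied to the rank variables."""
    return covariance_ratio(average_ranks(x), average_ranks(y))


# A nonlinear but strictly increasing relationship still gives rho = 1.
print(spearman_rho([1, 2, 3, 4], [1, 4, 9, 16]))
```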

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 no linear correlation, and +1 total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that, for a linear relationship, the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
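
As a minimal sketch of that formula, the function below divides the covariance by the product of the standard deviations (the 1/n factors cancel, so sums of squares are used directly). The sample data are made up for illustration. The second call shows the invariance under a shift and a positive rescaling of one variable.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / sqrt(var_x * var_y)


# Illustrative data only, not taken from the profiled dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shifting x by 3 and scaling it by 10 leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)
print(r, r_rescaled)
```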

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 no correlation, and +1 total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
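
The pair-counting definition above corresponds to the variant usually called τ-a. A minimal sketch follows; treating tied pairs as neither concordant nor discordant is an assumption, and statistical libraries often report a tie-corrected variant instead, so results on heavily tied data such as the residual columns may differ from those in the report.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0: tie in x or y, counted as neither
    return (concordant - discordant) / len(pairs)


# Six pairs in total: five concordant and one discordant (the 3/2 swap).
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```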

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

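First rows

(index) | Unnamed: 0 | Date | Bulletin | Obs Code | Residual RA | Residual Dec | Num | Magnitude | Magnitude Obs Code | Obs Code Agreement | Diagonal Residual
20 | 20 | 1/7/2022 | 2022-A62 | 654 | 0.4 | 0.1 | 23 | 21.8 | 654 | True | 0.412311
25 | 25 | 1/7/2022 | 2022-A62 | 654 | 0.3 | 0.2 | 28 | 21.5 | 654 | True | 0.360555
26 | 26 | 1/7/2022 | 2022-A62 | 654 | 0.1 | 0.1 | 29 | 20.5 | 654 | True | 0.141421
38 | 38 | 1/13/2022 | 2022-A171 | 654 | 0.2 | 0.1 | 13 | 20.9 | 654 | True | 0.223607
39 | 39 | 1/13/2022 | 2022-A171 | 654 | 0.1 | 0.2 | 14 | 20.9 | 654 | True | 0.223607
40 | 40 | 1/13/2022 | 2022-A171 | 654 | 0.3 | 0.3 | 15 | 18.4 | 654 | True | 0.424264
41 | 41 | 1/13/2022 | 2022-A171 | 654 | 1.8 | 1.0 | 16 | 18.4 | 654 | True | 2.059126
62 | 62 | 1/13/2022 | 2022-A173 | 654 | 0.0 | 0.1 | 12 | 20.6 | 654 | True | 0.100000
63 | 63 | 1/13/2022 | 2022-A173 | 654 | 0.1 | 0.1 | 13 | 20.6 | 654 | True | 0.141421
64 | 64 | 1/13/2022 | 2022-A173 | 654 | 0.0 | 0.1 | 14 | 21.1 | 654 | True | 0.100000

Last rows

(index) | Unnamed: 0 | Date | Bulletin | Obs Code | Residual RA | Residual Dec | Num | Magnitude | Magnitude Obs Code | Obs Code Agreement | Diagonal Residual
7302 | 7302 | 10/25/2022 | 2022-U250 | 654 | 0.0 | 0.0 | 16 | 21.0 | 654 | True | 0.000000
7303 | 7303 | 10/25/2022 | 2022-U250 | 654 | 0.1 | 0.0 | 17 | 21.0 | 654 | True | 0.100000
7322 | 7322 | 10/26/2022 | 2022-U280 | 654 | 0.0 | 0.1 | 19 | 19.5 | 654 | True | 0.100000
7323 | 7323 | 10/26/2022 | 2022-U280 | 654 | 0.0 | 0.1 | 20 | 19.5 | 654 | True | 0.100000
7324 | 7324 | 10/26/2022 | 2022-U280 | 654 | 0.0 | 0.2 | 21 | 19.5 | 654 | True | 0.200000
7350 | 7350 | 10/27/2022 | 2022-U290 | 654 | 0.0 | 0.1 | 17 | 20.7 | 654 | True | 0.100000
7351 | 7351 | 10/27/2022 | 2022-U290 | 654 | 0.1 | 0.1 | 18 | 20.7 | 654 | True | 0.141421
7379 | 7379 | 10/29/2022 | 2022-U348 | 654 | 0.1 | 0.1 | 26 | 20.7 | 654 | True | 0.141421
7388 | 7388 | 10/29/2022 | 2022-U348 | 654 | 0.0 | 0.1 | 36 | 20.3 | 654 | True | 0.100000
7389 | 7389 | 10/29/2022 | 2022-U348 | 654 | 0.1 | 0.2 | 37 | 20.9 | 654 | True | 0.223607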
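
The sample rows suggest that Diagonal Residual is the Euclidean combination of the two residuals, sqrt(Residual RA² + Residual Dec²). This relationship is inferred from the numbers and is not stated anywhere in the report; it would also account for the high-correlation alerts among the three columns. A short check against a few rows, with a hypothetical function name:

```python
import math


def diagonal_residual(residual_ra: float, residual_dec: float) -> float:
    """Euclidean norm of the two residual components (inferred relationship)."""
    return math.hypot(residual_ra, residual_dec)


# (Residual RA, Residual Dec, reported Diagonal Residual) from the sample rows.
sample_rows = [
    (0.4, 0.1, 0.412311),
    (0.3, 0.2, 0.360555),
    (1.8, 1.0, 2.059126),
    (0.0, 0.0, 0.000000),
]

for ra, dec, reported in sample_rows:
    computed = diagonal_residual(ra, dec)
    print(f"RA={ra} Dec={dec} computed={computed:.6f} reported={reported:.6f}")
```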